Number of exons in each CC
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##   1.000   1.000   1.000   2.151   1.000  86.000
Number of CCs in total: 11556
Number of CCs with one exon: 9997
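The counts above could be reproduced along the following lines. This is a minimal base-R sketch, assuming an exon table with one row per exon and a `gIdx` column naming the CC each exon belongs to (the column name is borrowed from the output printed later in this report; the toy rows are hypothetical).

```r
# Hypothetical exon table: one row per exon, gIdx names its connected component
exons <- data.frame(
  gIdx = c("gene1", "gene2", "gene2", "gene3", "gene4", "gene4", "gene4"),
  stringsAsFactors = FALSE
)

# Number of exons in each CC
exon_cnts <- as.vector(table(exons$gIdx))

# Five-number summary plus mean, as printed by summary()
exon_summary <- summary(exon_cnts)

# Total number of CCs and number of CCs with a single exon
n_cc     <- length(exon_cnts)
n_single <- sum(exon_cnts == 1)

# Number of CCs for each exon count
cnt_dist <- table(exon_cnts)

print(exon_summary)
print(cnt_dist)
```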
Number of CCs by number of exons
mean exon expr. by # of exons
median exon expr. by # of exons
## [1] TRUE
location on chr19
median vs. MAD expr. of single exon CCs
There were 400 unique UCSC genes that overlapped with 946 unique CCs. The total number of gene-CC overlaps was 964.
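The three numbers differ because one CC can overlap several genes and one gene can overlap several CCs. The sketch below illustrates the bookkeeping in base R on hypothetical intervals; the data frames `genes` and `ccs` and their columns are assumptions made for illustration, and a real analysis on genomic ranges would more likely rely on a dedicated overlap routine such as `findOverlaps()` from GenomicRanges.

```r
# Hypothetical gene and CC intervals on a single chromosome (closed intervals)
genes <- data.frame(id = c("A", "B", "C"),
                    start = c(100, 300, 1000),
                    end   = c(200, 400, 1100))
ccs   <- data.frame(id = c("cc1", "cc2", "cc3", "cc4"),
                    start = c(150, 190, 350, 500),
                    stop  = c(160, 310, 360, 600))

# Two closed intervals overlap when each starts no later than the other ends
ov <- outer(genes$start, ccs$stop, "<=") & outer(genes$end, ccs$start, ">=")

# One row per (gene, CC) overlapping pair
hits <- which(ov, arr.ind = TRUE)

n_overlaps <- nrow(hits)                    # total overlapping pairs
n_genes    <- length(unique(hits[, "row"])) # unique genes with any overlap
n_ccs      <- length(unique(hits[, "col"])) # unique CCs with any overlap

cat(n_genes, "genes overlapped", n_ccs, "CCs;", n_overlaps, "overlaps in total\n")
```

Here `cc2` spans genes `A` and `B`, so it contributes two overlaps but only one unique CC.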
We now add UCSC KnownGenes to the track plots above. Note that these are not real transcripts, just the ‘union’ transcripts for each gene, constructed by joining all reported isoforms. (Note: fixed() <- TRUE must be specified for the UCSC track because the plot includes an ideogram.)
Next, we focus on SPTBN4, which overlapped with 16 single-exon CCs.
##
## uc002onx.3 uc002ony.3 uc002onz.3 uc002ooa.3 uc010egx.3 uc010egy.1
##         11         16         16          1          9          1
## uc031rks.1
##          5
Number of exons in each CC
## exon_cnts
##    1    2    3    4    5    6
## 9997  381  117   97   88   96
Number of exons/junctions in each CC
## ej_cnts
##    1    2    3    4    5    6
## 9997  253  151   11   79   16
some CCs with 2 exons but no junctions
## [1] "gene10021" "gene10052" "gene10061" "gene10084" "gene10146"
## [6] "gene10276" "gene103" "gene10358" "gene1038" "gene10492"
##            gIdx   gStart    gStop kind    start     stop
## 34760 gene10021 54007583 54007739    e 54007583 54007677
## 34761 gene10021 54007583 54007739    e 54007678 54007739
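In the gene10021 rows above, the second exon starts at 54007678, one base after the first exon ends at 54007677, so the two exons abut and there is no gap for a junction to span. A minimal sketch of how such CCs might be flagged, assuming an exon table with `gIdx`, `start`, and `stop` columns as in the printed output; the `toy_gene` rows are hypothetical and the check itself is an illustration, not the original analysis code.

```r
# gene10021 rows are copied from the output above; toy_gene is a hypothetical
# two-exon CC whose exons are separated by a gap
exons <- data.frame(
  gIdx  = c("gene10021", "gene10021", "toy_gene", "toy_gene"),
  start = c(54007583, 54007678, 100, 300),
  stop  = c(54007677, 54007739, 200, 400),
  stringsAsFactors = FALSE
)

# Sort exons by position within each CC
exons <- exons[order(exons$gIdx, exons$start), ]

# A multi-exon CC "abuts" if every exon starts one base after the previous one ends
abuts <- vapply(split(exons, exons$gIdx), function(d) {
  nrow(d) > 1 && all(d$start[-1] == d$stop[-nrow(d)] + 1)
}, logical(1))

print(abuts)
```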
compare UCSC KLK12 models vs. CCs in region
include nearby genes (KLK9, KLK10, KLK11, KLK12, KLK13)
most likely connected component was gene9317
equal width exons
adding annotations
looking at clustering along the exons at KLK12
exon PCA decomposition (scores)
exon PCA decomposition (loadings)
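For readers unfamiliar with the two views, the sketch below shows how scores and loadings arise from a PCA in base R. It assumes a samples-by-exons expression matrix and uses `prcomp()`; the report does not state which function or matrix orientation was used, so both are assumptions, and the simulated data are purely illustrative.

```r
set.seed(1)

# Hypothetical expression matrix: 20 samples (rows) by 5 exons (columns)
expr <- matrix(rnorm(20 * 5), nrow = 20,
               dimnames = list(paste0("sample", 1:20), paste0("exon", 1:5)))

# PCA on column-centered data
pca <- prcomp(expr, center = TRUE, scale. = FALSE)

scores   <- pca$x         # one row per sample, one column per component
loadings <- pca$rotation  # one row per exon, one column per component

# Scores times transposed loadings recovers the centered data
centered <- scale(expr, center = TRUE, scale = FALSE)
recon    <- scores %*% t(loadings)
print(all.equal(recon, centered, check.attributes = FALSE))
```

Plotting the columns of `scores` shows how samples separate along each component, while the columns of `loadings` show which exons drive that separation.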
look at clusters
scores by clusters
expression by clusters
look at clusters
scores by clusters
expression by clusters
##    user  system elapsed
##  62.634   2.911  67.470
## [1] "2014-11-04 15:51:58 EST"
## R version 3.1.1 (2014-07-10)
## Platform: x86_64-apple-darwin13.3.0 (64-bit)
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] grid parallel stats graphics grDevices utils datasets
## [8] methods base
##
## other attached packages:
## [1] mvnmle_0.1-11
## [2] igraph_0.7.1
## [3] PTAk_1.2-9
## [4] tensor_1.5
## [5] org.Hs.eg.db_2.14.0
## [6] RSQLite_1.0.0
## [7] DBI_0.3.1
## [8] TxDb.Hsapiens.UCSC.hg19.knownGene_2.14.0
## [9] BSgenome.Hsapiens.UCSC.hg19_1.3.1000
## [10] annotate_1.42.1
## [11] SplicingGraphs_1.4.1
## [12] Rgraphviz_2.8.1
## [13] graph_1.42.0
## [14] GenomicAlignments_1.0.6
## [15] BSgenome_1.32.0
## [16] Rsamtools_1.16.1
## [17] Biostrings_2.32.1
## [18] XVector_0.4.0
## [19] GenomicFeatures_1.16.3
## [20] AnnotationDbi_1.26.1
## [21] Biobase_2.24.0
## [22] GenomicRanges_1.16.4
## [23] GenomeInfoDb_1.0.2
## [24] IRanges_1.22.10
## [25] RColorBrewer_1.0-5
## [26] ggbio_1.12.10
## [27] BiocGenerics_0.10.0
## [28] ggplot2_1.0.0
##
## loaded via a namespace (and not attached):
## [1] acepack_1.3-3.3 base64enc_0.1-2
## [3] BatchJobs_1.5 BBmisc_1.8
## [5] BiocParallel_0.6.1 biomaRt_2.20.0
## [7] biovizBase_1.12.3 bitops_1.0-6
## [9] brew_1.0-6 checkmate_1.5.0
## [11] cluster_1.15.3 codetools_0.2-9
## [13] colorspace_1.2-4 compiler_3.1.1
## [15] dichromat_2.0-0 digest_0.6.4
## [17] evaluate_0.5.5 fail_1.2
## [19] foreach_1.4.2 foreign_0.8-61
## [21] formatR_1.0 Formula_1.1-2
## [23] gridExtra_0.9.1 gtable_0.1.2
## [25] Hmisc_3.14-5 htmltools_0.2.6
## [27] iterators_1.0.7 knitr_1.7
## [29] labeling_0.3 lattice_0.20-29
## [31] latticeExtra_0.6-26 MASS_7.3-35
## [33] munsell_0.4.2 nnet_7.3-8
## [35] plyr_1.8.1 proto_0.3-10
## [37] Rcpp_0.11.3 RCurl_1.95-4.3
## [39] reshape2_1.4 rmarkdown_0.3.3
## [41] rpart_4.1-8 rtracklayer_1.24.2
## [43] scales_0.2.4 sendmailR_1.2-1
## [45] splines_3.1.1 stats4_3.1.1
## [47] stringr_0.6.2 survival_2.37-7
## [49] tools_3.1.1 VariantAnnotation_1.10.5
## [51] XML_3.98-1.1 xtable_1.7-4
## [53] yaml_2.1.13 zlibbioc_1.10.0